AITopics | layer neuron

layer neuron

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Experiment on creating a neural network with weights determined by the potential of a simulated electrostatic field

Geidarov, Polad

arXiv.org Artificial Intelligence, Jul-8-2025

This paper explores the possibility of determining the weights and thresholds of a neural network using the potential -- a parameter of an electrostatic field -- without analytical calculations and without applying training algorithms. The work is based on neural network architectures employing metric recognition methods. The electrostatic field is simulated in the Builder C++ environment. In the same environment, a neural network based on metric recognition methods is constructed, with the weights of the first-layer neurons determined by the values of the potentials of the simulated electrostatic field. The effectiveness of the resulting neural network within the simulated system is evaluated using the MNIST test dataset under various initial conditions of the simulated system. The results demonstrated functional viability. The implementation of this approach shows that a neural network can obtain weight values almost instantaneously from the electrostatic field, without the need for analytical computations, lengthy training procedures, or massive training datasets.
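
The abstract does not describe the simulation itself, so the following is only a minimal sketch of the general idea: the potential of a point charge, V = k·q/r, is summed over a grid and the resulting map is read off as the weights of one first-layer neuron. The charge placement, the `softening` term, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

def potential_map(charges, height, width, k=1.0, softening=0.5):
    """Potential at each grid cell from point charges given as (row, col, q).

    Sums V = k * q / r over all charges. `softening` is an illustrative
    assumption that keeps r away from zero at the cell holding a charge.
    """
    grid = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for charge_row, charge_col, q in charges:
                dist = math.hypot(r - charge_row, c - charge_col)
                grid[r][c] += k * q / (dist + softening)
    return grid

def neuron_response(weights, image):
    """Weighted sum of pixel values, with the potential map used as weights."""
    return sum(w * x
               for w_row, x_row in zip(weights, image)
               for w, x in zip(w_row, x_row))

# Hypothetical 3x3 template: a single unit charge in the centre cell.
weights = potential_map([(1, 1, 1.0)], height=3, width=3)

centred = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
corner = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
print(neuron_response(weights, centred), neuron_response(weights, corner))
```

In this toy setup the neuron responds more strongly to a pixel that coincides with the charge than to one far from it, and no training step is involved in obtaining the weights.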

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.3103/S0147688222050161

2507.02933

Country: Asia (0.46)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural networks with image recognition by pairs

Geidarov, Polad

arXiv.org Artificial Intelligence, Jun-10-2025

Neural networks based on metric recognition methods have a strictly determined architecture. The number of neurons and connections, as well as the values of the weights and thresholds, are calculated analytically from the initial conditions of the task: the number of recognizable classes, the number of samples, and the metric expressions used. This paper discusses the possibility of transforming these networks so that classical learning algorithms can be applied to them without using analytical expressions to calculate the weight values. In the resulting network, training is carried out by recognizing images in pairs. This approach simplifies the learning process and makes it easy to expand the neural network by adding new images to the recognition task. The advantages of these networks include: 1) simplicity and transparency of the network architecture; 2) simplicity and reliability of training; 3) the possibility of using a large number of images in a recognition problem; 4) a consistent increase in the number of recognizable classes without changing the previous values of the weights and thresholds.
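
To make the notion of analytically calculated weights concrete, here is a minimal sketch for the Euclidean metric. An input x is closer to template a than to template b exactly when (a - b)·x > (|a|² - |b|²)/2, so one neuron per pair of templates can be written down directly. This is a textbook identity used for illustration, not necessarily the author's formulas; the function names and the toy templates are hypothetical.

```python
def pair_neuron(template_a, template_b):
    """Weights and threshold of a neuron that fires when x is closer to a than to b."""
    weights = [a - b for a, b in zip(template_a, template_b)]
    threshold = 0.5 * (sum(a * a for a in template_a) - sum(b * b for b in template_b))
    return weights, threshold

def fires(weights, threshold, x):
    """Threshold unit: weighted sum compared against the threshold."""
    return sum(w * xi for w, xi in zip(weights, x)) > threshold

def classify(templates, x):
    """Return the class whose template wins every pairwise comparison."""
    for a in templates:
        if all(fires(*pair_neuron(templates[a], templates[b]), x)
               for b in templates if b != a):
            return a
    return None  # no strict winner (exact tie between templates)

# Hypothetical 4-pixel templates, one per class.
templates = {
    "zero": [1, 0, 0, 1],
    "one": [0, 1, 1, 0],
    "two": [1, 1, 0, 0],
}
print(classify(templates, [0.9, 0.1, 0.2, 0.8]))  # prints "zero"
```

Adding a new class in this sketch only adds new pair neurons; the weights and thresholds of the existing pairs stay as they were, which mirrors advantage 4 in the abstract.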

artificial intelligence, machine learning, pattern recognition, (17 more...)

arXiv.org Artificial Intelligence

2506.06322

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.41)

Evaluating Neuron Explanations: A Unified Framework with Sanity Checks

Oikarinen, Tuomas, Yan, Ge, Weng, Tsui-Wei

arXiv.org Artificial Intelligence, Jun-9-2025

Understanding the function of individual units in a neural network is an important building block for mechanistic interpretability. This is often done by generating a simple text explanation of the behavior of individual neurons or units. For these explanations to be useful, we must understand how reliable and truthful they are. In this work we unify many existing explanation evaluation methods under one mathematical framework. This allows us to compare existing evaluation metrics, understand the evaluation pipeline with increased clarity and apply existing statistical methods on the evaluation. In addition, we propose two simple sanity checks on the evaluation metrics and show that many commonly used metrics fail these tests and do not change their score after massive changes to the concept labels. Based on our experimental and theoretical results, we propose guidelines that future evaluations should follow and identify a set of reliable evaluation metrics.
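
A toy illustration of the kind of failure the abstract describes, assuming binarized neuron activations and binary concept labels. It is not the paper's framework or its exact sanity checks: it only shows that one plausible metric (precision) keeps its score when the concept labels are massively broadened, while another (intersection over union) does not. All names and data here are made up for the example.

```python
def precision(active, concept):
    """Fraction of inputs that activate the neuron and also carry the concept."""
    hits = sum(1 for a, c in zip(active, concept) if a and c)
    return hits / sum(active)

def iou(active, concept):
    """Intersection over union of the activating set and the concept set."""
    inter = sum(1 for a, c in zip(active, concept) if a and c)
    union = sum(1 for a, c in zip(active, concept) if a or c)
    return inter / union

# Eight hypothetical inputs; the neuron is active on the first two.
active = [1, 1, 0, 0, 0, 0, 0, 0]
concept = [1, 1, 0, 0, 0, 0, 0, 0]  # labels that match the neuron exactly
broad = [1, 1, 1, 1, 1, 1, 1, 1]    # labels changed so every input has the concept

print(precision(active, concept), precision(active, broad))  # 1.0 1.0
print(iou(active, concept), iou(active, broad))              # 1.0 0.25
```

Precision cannot tell the exact explanation from the overly broad one, so on its own it would fail a check of this kind, whereas IoU drops from 1.0 to 0.25.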

explanation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.05774

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A comparative analysis of a neural network with calculated weights and a neural network with random generation of weights based on the training dataset size

Geidarov, Polad

arXiv.org Artificial Intelligence, Jun-2-2025

The paper discusses the capabilities of multilayer perceptron neural networks implementing metric recognition methods, for which the values of the weights are calculated analytically by formulas. Comparative experiments in training a neural network with pre-calculated weights and with random initialization of weights on different sizes of the MNIST training dataset are carried out. The results of the experiments show that a multilayer perceptron with pre-calculated weights can be trained much faster and is much more robust to the reduction of the training dataset.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2505.23876

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.97)

Seeking Next Layer Neurons' Attention for Error-Backpropagation-Like Training in a Multi-Agent Network Framework

Moakhar, Arshia Soltani, Azizmalayeri, Mohammad, Mirzaei, Hossein, Manzuri, Mohammad Taghi, Rohban, Mohammad Hossein

arXiv.org Artificial Intelligence, Oct-15-2023

Despite considerable theoretical progress in the training of neural networks viewed as a multi-agent system of neurons, particularly concerning biological plausibility and decentralized training, their applicability to real-world problems remains limited due to scalability issues. In contrast, error-backpropagation has demonstrated its effectiveness for training deep networks in practice. In this study, we propose a local objective for neurons that, when pursued by each neuron individually, aligns their behavior with error-backpropagation in terms of efficiency and scalability during training. For this purpose, we examine a neural network comprising decentralized, self-interested neurons seeking to maximize their local objective -- attention from subsequent layer neurons -- and identify the optimal strategy for neurons. We also analyze the relationship between this strategy and backpropagation, establishing conditions under which the derived strategy is equivalent to error-backpropagation. Lastly, we demonstrate the learning capacity of these multi-agent neural networks through experiments on three datasets and showcase their superior performance relative to error-backpropagation in a catastrophic forgetting benchmark.

error-backpropagation-like training, neural network, neuron, (14 more...)

arXiv.org Artificial Intelligence

2310.09952

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

A Missing Value Filling Model Based on Feature Fusion Enhanced Autoencoder

Liu, Xinyao, Du, Shengdong, Li, Tianrui, Teng, Fei, Yang, Yan

arXiv.org Artificial Intelligence, Aug-3-2023

With the advent of the big data era, the data quality problem is becoming more critical. Among many factors, data with missing values is one primary issue, and thus developing effective imputation models is a key topic in the research community. Recently, a major research direction has been to employ neural network models such as self-organizing maps or autoencoders for filling missing values. However, these classical methods can hardly discover interrelated features and common features simultaneously among data attributes. In particular, classical autoencoders often learn invalid constant mappings, which dramatically hurts the filling performance. To solve these problems, we propose a missing-value-filling model based on a feature-fusion-enhanced autoencoder. We first incorporate into an autoencoder a hidden layer that consists of de-tracking neurons and radial basis function neurons, which enhances the ability to learn interrelated features and common features. In addition, we develop a missing value filling strategy based on dynamic clustering that is incorporated into an iterative optimization process. This design enhances the multi-dimensional feature fusion ability and thus improves the dynamic collaborative missing-value-filling performance. The effectiveness of the proposed model is validated by extensive experiments against a variety of baseline methods on thirteen data sets.
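
For readers unfamiliar with radial basis function neurons, the sketch below shows the common Gaussian form, whose output depends only on the distance between the input and a stored centre. The Gaussian kernel, the `width` parameter, and the function name are assumptions made for illustration; the abstract does not specify which form the model uses, and the de-tracking neurons are not sketched here.

```python
import math

def rbf_neuron(x, center, width=1.0):
    """Gaussian RBF unit: 1.0 at the centre, decaying with squared distance."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-squared_distance / (2.0 * width ** 2))

# Hypothetical two-attribute record compared against two stored centres.
record = [0.0, 0.0]
print(rbf_neuron(record, [1.0, 0.0]))  # near centre: larger response
print(rbf_neuron(record, [3.0, 4.0]))  # far centre: response close to zero
```

Because the response is high only near the centre, a layer of such units can signal which group of similar records an input belongs to, which is one way to read the "common features" mentioned in the abstract.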

dataset, neuron, radial basis function neuron, (13 more...)

arXiv.org Artificial Intelligence

2208.13495

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.46)
Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks

Oikarinen, Tuomas, Weng, Tsui-Wei

arXiv.org Artificial Intelligence, Jun-5-2023

In this paper, we propose CLIP-Dissect, a new technique to automatically describe the function of individual hidden neurons inside vision networks. CLIP-Dissect leverages recent advances in multimodal vision/language models to label internal neurons with open-ended concepts without the need for any labeled data or human examples. We show that CLIP-Dissect provides more accurate descriptions than existing methods for last layer neurons where the ground-truth is available as well as qualitatively good descriptions for hidden layer neurons. In addition, our method is very flexible: it is model agnostic, can easily handle new concepts and can be extended to take advantage of better multimodal models in the future. CLIP-Dissect is also computationally efficient and can label all neurons from five layers of ResNet-50 in just 4 minutes, which is more than 10 times faster than existing methods. Our code is available at https://github.com/Trustworthy-ML-Lab/CLIP-dissect. Finally, crowdsourced user study results are available in Appendix B to further support the effectiveness of our method.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2204.10965

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Depth Separation with Multilayer Mean-Field Networks

Ren, Yunwei, Zhou, Mo, Ge, Rong

arXiv.org Artificial Intelligence, Apr-3-2023

Depth separation, the question of why a deeper network is more powerful than a shallower one, has been a major problem in deep learning theory. Previous results often focus on representation power. For example, Safran et al. (2019) constructed a function that is easy to approximate using a 3-layer network but not approximable by any 2-layer network. In this paper, we show that this separation is in fact algorithmic: one can efficiently learn the function constructed by Safran et al. (2019) using an overparameterized network with polynomially many neurons. Our result relies on a new way of extending the mean-field limit to multilayer networks, and a decomposition of loss that factors out the error introduced by the discretization of infinite-width mean-field networks.

One of the mysteries in deep learning theory is why we need deeper networks. In particular, the seminal works of Eldan & Shamir (2016) and Safran et al. (2019) constructed a simple function [...]. However, these results are only about the representation power of neural networks and do not guarantee that training a deep neural network from reasonable initialization can indeed learn such functions. To analyze the training dynamics, we develop a new framework to generalize mean-field analysis of neural networks (Chizat & Bach, 2018; Mei et al., 2018) to multiple layers. As a result, all the layer weights can change significantly during the training process (unlike many previous works on neural tangent kernel or fixing lower-layer representations). Our analysis also gives a decomposition of loss that allows us to decouple the training of multiple layers.

In the remainder of the paper, we first introduce our new framework for multilayer mean-field analysis, then give our main result and techniques. We discuss several related works on the algorithmic aspect of depth separation in Section 1.3. Similar to standard mean-field analysis, we first consider the infinite-width dynamics in Section 3, then we discuss our new ideas in discretizing the result to a polynomial-size network (see Section 4). We propose a new way to extend the mean-field analysis to multiple layers.

artificial intelligence, machine learning, stage 1, (18 more...)

arXiv.org Artificial Intelligence

2304.01063

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

An Exact Poly-Time Membership-Queries Algorithm for Extracting a Three-Layer ReLU Network

Daniely, Amit, Granot, Elad

arXiv.org Artificial Intelligence, Mar-4-2023

We consider the natural problem of learning a ReLU network from queries, which was recently remotivated by model extraction attacks. In this work, we present a polynomial-time algorithm that can learn a depth-two ReLU network from queries under mild general position assumptions. We also present a polynomial-time algorithm that, under mild general position assumptions, can learn a rich class of depth-three ReLU networks from queries. For instance, it can learn most networks where the number of first layer neurons is smaller than the dimension and the number of second layer neurons. These two results substantially improve state-of-the-art: Until our work, polynomial-time algorithms were only shown to learn from queries depth-two networks under the assumption that either the underlying distribution is Gaussian (Chen et al. (2021)) or that the weights matrix rows are linearly independent (Milli et al. (2019)). For depth three or more, there were no known poly-time results. With the growth of neural-network-based applications, many commercial companies offer machine learning services, allowing public use of trained networks as a black-box. Those networks allow the user to query the model and, in some cases, return the exact output of the network to allow the users to reason about the model's output.
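
To give a feel for what learning from queries means, the sketch below locates the point where a ReLU unit switches on, using nothing but output values of a black box restricted to a line. Because a ReLU network is piecewise linear, the two linear pieces on either side of a single kink can be estimated from a few queries and intersected. This is a generic illustration under strong assumptions (exactly one kink in the interval, exact outputs), not the algorithm of the paper; `black_box`, `locate_kink`, and the network weights are hypothetical.

```python
def relu(z):
    return max(0.0, z)

def black_box(t):
    """Hypothetical network restricted to a line; only its outputs are observed."""
    return 1.5 * relu(2.0 * t - 1.0) + 0.5 * relu(t + 3.0)

def locate_kink(query, lo, hi, eps=1e-3):
    """Estimate the kink position, assuming exactly one kink in (lo + eps, hi - eps).

    Returns None when the function looks linear on the whole interval.
    """
    f_lo, f_hi = query(lo), query(hi)
    slope_left = (query(lo + eps) - f_lo) / eps
    slope_right = (f_hi - query(hi - eps)) / eps
    if abs(slope_left - slope_right) < 1e-9:
        return None
    # Solve f_lo + slope_left * (t - lo) = f_hi + slope_right * (t - hi) for t.
    return (f_hi - f_lo + slope_left * lo - slope_right * hi) / (slope_left - slope_right)

print(locate_kink(black_box, 0.0, 2.0))  # close to 0.5, where 2t - 1 changes sign
print(locate_kink(black_box, 1.0, 2.0))  # None: no kink in this interval
```

Four queries suffice here because the interval is assumed to contain one kink; handling many neurons, several layers, and degenerate configurations is where the general position assumptions mentioned in the abstract come in.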

artificial intelligence, machine learning, neuron, (16 more...)

arXiv.org Artificial Intelligence

2105.09673

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre:

Workflow (0.68)
Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

MPVNN: Mutated Pathway Visible Neural Network Architecture for Interpretable Prediction of Cancer-specific Survival Risk

Roy, Gourab Ghosh, Geard, Nicholas, Verspoor, Karin, He, Shan

arXiv.org Artificial Intelligence, Feb-2-2022

Survival risk prediction using gene expression data is important in making treatment decisions in cancer. Standard neural network (NN) survival analysis models are black boxes that lack interpretability. More interpretable visible neural network (VNN) architectures are designed using biological pathway knowledge, but they do not model how pathway structures can change for particular cancer types. We propose a novel Mutated Pathway VNN or MPVNN architecture, designed using prior signaling pathway knowledge and gene mutation data-based edge randomization simulating signal flow disruption. As a case study, we use the PI3K-Akt pathway and demonstrate overall improved cancer-specific survival risk prediction results of MPVNN over standard non-NN and other similar-sized NN survival analysis methods. We show that trained MPVNN architecture interpretation, which points to smaller sets of genes connected by signal flow within the PI3K-Akt pathway that are important in risk prediction for particular cancer types, is reliable.
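
The idea of a network whose wiring follows pathway knowledge can be sketched as a layer in which a weight contributes only if the corresponding edge exists in a prior graph. The gene names, edges, weights, and the ReLU activation below are placeholders chosen for illustration; the mutation-based edge randomization that distinguishes MPVNN is not shown.

```python
def masked_layer(inputs, weights, edges):
    """Each target node sums only the inputs linked to it by an allowed edge."""
    outputs = {}
    for target, incoming in weights.items():
        total = 0.0
        for source, weight in incoming.items():
            if (source, target) in edges:  # connections absent from the graph are ignored
                total += weight * inputs[source]
        outputs[target] = max(0.0, total)  # ReLU activation (an assumption)
    return outputs

# Hypothetical three-gene pathway: gene_a -> gene_b -> gene_c.
edges = {("gene_a", "gene_b"), ("gene_b", "gene_c")}
weights = {
    "gene_b": {"gene_a": 0.8, "gene_c": 5.0},  # gene_c -> gene_b is not an edge
    "gene_c": {"gene_b": 1.2, "gene_a": 5.0},  # gene_a -> gene_c is not an edge
}
expression = {"gene_a": 1.0, "gene_b": 0.5, "gene_c": 2.0}

out = masked_layer(expression, weights, edges)
print(out)
```

Since every surviving weight corresponds to a named edge between named genes, a trained layer of this kind can be inspected edge by edge, which is the sense in which such architectures are called visible.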

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1093/bioinformatics/btac636

2202.00882

Country:

Oceania > Australia (0.04)
Europe > United Kingdom (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Carcinoma (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
